-
-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix court string matching with whitespace #144
Fix court string matching with whitespace #144
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
My understanding is that in this comment #135 (comment), @flooie eliminated duplicate abbreviations from courts DB, so now it's safe to do things like this.
@flooie, wil you give this a quick review too? The only changes you need to look at are in resolve.py, the rest is updating Black.
Assuming this is good, we need to return to:
- Court name issues #129 (might be obsolete)
- Court name errors #128 (might be fixed)
- Getting full citation span #135 (don't we already support this, regardless of the code here?)
@flooie looks like this one never got your approval. Mind taking a look, please? |
Want me to rebase this? |
That'd be great, thanks @mattdahl |
94b1e2f
to
96dcc84
Compare
2d0f7cb
to
c9b4d78
Compare
Thanks. I'll take a look soon. |
I added a few lines because if you test it with "Commonwealth v. Muniz, 164 A.3d 1189 (Pa. 2017)" as mentioned in the issue, it now returns "njcirctpassaic" instead of "pa", this happens because the first court in the list returned by courts-db that starts with "Pa" is "Passaic Cty. Cir. Ct., N.J." I added a check to look for an exact match before try the startswith approach, for this specific case this approach works. |
Thanks, I agree that looking for an exact match first is good (cf. #129 (comment)). But can we refactor this so we only iterate over Something like:
|
great, thanks, it looks more readable, I'll push this change |
@flooie this PR is ready, i think that check "Benchmark Pull Request / PR comment (pull_request)" may not be passing because it is a branch of a fork, that is why there may not be permissions to call the workflow that adds the comment with the benchmark |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mattdahl
As discussed in #135 (comment), there is presently a bug where court strings without whitespace are not properly matched. 3b2fe09 implements a failing test for this bug. 94b1e2f implements a simple fix.
This PR is also related to the changes proposed in #129, but I think that that proposal has been made obsolete with the removal of all the duplicate citation strings by @flooie (#135 (comment)). In any case, this PR addresses a different problem re whitespace.
Note that this PR is based off of #143 (needed to update black to make GitHub Actions happy), so that should be merged first.